Abstract

A single target is hidden at a location chosen from a predetermined probability distribution.
Then, a searcher must find a second probability distribution from which random search points are
sampled such that the target is found in the minimum number of trials. Here it will be shown
that if the searcher must get very close to the target to find it, then the best search distribution is
proportional to the square root of the target distribution regardless of dimension. For a Gaussian
target distribution, the optimum search distribution is approximately a Gaussian with a standard
deviation that varies inversely with how close the searcher must be to the target to find it. For a
network, where the searcher randomly samples nodes and looks for the fixed target along edges,
the optimum is to either sample a node with probability proportional to the square root of the out
degree plus one or not at all.

I. INTRODUCTION



Imagine a single target hidden with some known distribution and a searcher trying to
locate it by randomly drawing points from another distribution. Assuming the cost for the
search only depends on the number of steps, an optimal searcher has to choose a distribution
that minimizes the average number of steps to locate the target. This situation is different
from the usual search paradigms in which there are many targets [1-4] and/or no knowledge
of where the target may be [5, 6]. When there are many targets, the solution is to perform
a random walk with a Lévy flight, but that depends on expanding the search time in the
distance between targets, which is not possible when there is only one target [1]. Intermittent
searches are optimal if there is no knowledge of the target and large steps cost more than
small ones [7-9].

The goal here is to describe the optimum guess distribution to sample to find a target,
given the target distribution and assuming the searcher must be within a distance R of the
target to consider it found and with no additional cost for large steps. A familiar example
of this is searching with the eye for a particular target [10, 11], such as a face in a crowd.
Without any guess this is a very difficult proposition, but if someone who knows where the
target face is gestures, even inaccurately, toward the target, then that clue greatly improves
the search. In the case of the eye, the movement is very rapid compared to identification of
the target [12] so that the search time is dominated by the number of places checked rather
than, for example, the total path length traversed. Also, the searcher may not see the target
on the first pass, so memory is not really important. In other words, actual identification
of the target is unreliable so there is no reason to avoid a spot just because the search
failed there a few times. Under these assumptions, it will be shown that the optimal search
distribution is different from the target distribution, and specifically, if R is small compared
to the variability of the target distribution, then the search distribution is proportional to
the square root of the target distribution.



II. CONTINUOUS SYSTEMS 



Now consider definitions of the relevant quantities for a continuous target distribution.
Let G(x) be the unknown guess distribution, T(x) the known target distribution, and R the
range over which the guess is considered to find the target. For example, if T(x) is a delta
function so that the target is at x = 0, then the best guess is clearly also G(x) = δ(x) when
R = 0. For non-zero R a guess anywhere on the interval [-R, R] is considered a hit, so the
best guess distribution may be a boxcar from [-R, R]. For simplicity concentrate on one
dimension, although generalizing to two (or more) is trivial.

For a fixed target at position x and assuming the guesses are uncorrelated (memoryless),
the average number of guesses required before finding the target is just the inverse of the
probability that a guess is correct. Averaging over all possible target locations gives

$$\langle n \rangle = \int_{-\infty}^{\infty} dx \, \frac{T(x)}{\int_{x-R}^{x+R} dt \, G(t)} . \tag{1}$$

Note that the denominator in the integral over T(x) is non-local, which is the source of
some mathematical difficulties.

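The optimum ⟨n⟩ is found by setting to zero the functional derivative with respect to G,
subject to the constraint that G be normalized:

$$0 = \frac{\delta}{\delta G(y)} \left( \langle n \rangle [G(x)] - \alpha \left( 1 - \int_{-\infty}^{\infty} dx \, G(x) \right) \right) , \tag{2}$$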

where α is a Lagrange multiplier. To handle the non-local integral, define a function
Q(t; x, R) such that

$$\int_{x-R}^{x+R} dt \, G(t) = \int_{-\infty}^{\infty} dt \, G(t) \, Q(t; x, R) , \tag{3}$$

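i.e., Q(t;x,R) is a boxcar function of width 2R centered at x. Then, by the generalized
chain rule for functional derivatives, or as may be verified from the definition of a functional
derivative,

$$\frac{\delta \langle n \rangle}{\delta G(y)} = \lim_{\epsilon \to 0} \frac{1}{\epsilon} \Big( \langle n \rangle [G(t) + \epsilon \, \delta(t - y)] - \langle n \rangle [G(t)] \Big) = - \int_{-\infty}^{\infty} dx \, \frac{T(x) \, Q(y; x, R)}{\left[ \int_{-\infty}^{\infty} dt \, G(t) \, Q(t; x, R) \right]^2} . \tag{4}$$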

By definition, Q(y;x,R) = 1 whenever |x − y| < R. Thus, Q(y;x,R) = Q(x;y,R), and
note that this is regardless of dimension as long as the norm is well defined. The functional
derivative simplifies to

$$\frac{\delta \langle n \rangle}{\delta G(y)} = - \int_{y-R}^{y+R} dx \, \frac{T(x)}{\left[ \int_{x-R}^{x+R} dt \, G(t) \right]^2} . \tag{5}$$

Plugging equation [5] into equation [2] gives

$$\alpha = \int_{y-R}^{y+R} dx \, \frac{T(x)}{\left[ \int_{x-R}^{x+R} dt \, G(t) \right]^2} . \tag{6}$$

But, α is a constant and this must hold for any y, so the integrand must be a constant,
and, indeed, it must be α/2R, but we can rely on some constant A to handle normalization.
Thus, solving the integrand for G(x) and noting that everything is positive (G and T are
both probability distributions),

$$\int_{x-R}^{x+R} dt \, G(t) = A \sqrt{T(x)} . \tag{7}$$

Equation [7] provides the most general version of the main result: how the guess distribution
G(x) depends on the target distribution T(x).

Equation [7] has an interesting expansion near R → 0, where G is approximately constant
over the interval of integration so that the integral is approximately 2R G(x), giving

$$G(x) \approx A \sqrt{T(x)} , \tag{8}$$

where A takes care of normalization. Thus, in the limit that the range becomes small the
optimal guess distribution is constructed by taking the square root of the target distribution
and normalizing. This is somewhat surprising because at first glance the best choice for
the guess distribution would have been to match the target distribution. The square root
effectively widens the search distribution and lowers the chance of sampling at the most
likely position. This is advantageous because the search is completely memoryless, so the
most likely target location would be oversampled in a probability matching approach, and
the searcher would waste time repeatedly sampling the same site. A simple example is to
imagine just two discrete target sites, one with a 99% probability of having the target and
one with 1%. If the searcher probability matches, then the mean time to find the target is
⟨n⟩ = 0.99/0.99 + 0.01/0.01 = 2, and the searcher finds the target in one guess 99% of the
time and looks in the wrong spot an average of 100 times on 1% of the trials. This can be
improved significantly by looking at the more likely target slightly less often, say 98% of
the trials, making the mean ⟨n⟩ = 0.99/0.98 + 0.01/0.02 ≈ 1.5. The improvement comes from
searching at the unlikely hiding spot slightly more often, but not often enough to impact the
more likely case significantly. The square root rule finds the best trade off and generalizes
to more complex, continuous situations.
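
The two-site arithmetic can be checked directly. The following is a minimal Python sketch, not part of the original analysis; the function names `mean_trials` and `sqrt_rule` are illustrative choices. It evaluates the discrete mean, the sum of t_i/g_i, for probability matching, for the hand-shifted 98%/2% guess, and for the normalized square root of the target.

```python
import math

def mean_trials(target, guess):
    """Mean number of memoryless guesses: sum over sites of t_i / g_i."""
    return sum(t / g for t, g in zip(target, guess) if t > 0)

def sqrt_rule(target):
    """Guess distribution proportional to the square root of the target."""
    roots = [math.sqrt(t) for t in target]
    total = sum(roots)
    return [r / total for r in roots]

target = [0.99, 0.01]
matched = mean_trials(target, target)              # probability matching
shifted = mean_trials(target, [0.98, 0.02])        # hand-tuned 98% / 2%
optimal = mean_trials(target, sqrt_rule(target))   # square root rule
print(matched, shifted, optimal)
```

Under these assumptions the square root rule gives a mean of about 1.2 trials, compared with 2 for probability matching and about 1.5 for the 98%/2% choice.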



A nice feature to note here is that since G is actually calculated, we can at least approximate
the actual value for the mean number of trials ⟨n⟩ before finding the target (equation [1]).
Thus, if the number of steps is too large, say greater than 2⟨n⟩ as human searchers
seem to use [13], either the assumed target distribution is likely invalid or the target is not
present, and failure is recognized. Note that this is not the case for any Lévy or intermittent
search for a single target because their mean search times scale with the size of the search
region [14], so failure is difficult to recognize for those cases.



To solve for G in the case when T(x) is differentiable, take a derivative of equation [7]:

$$G(x + R) - G(x - R) = A \, \frac{T'(x)}{2 \sqrt{T(x)}} . \tag{9}$$

Now, redefine x → x − R, A to handle all the constants, and simplify the recursive equation
to give

$$G(x) = A \sum_{n \in \mathrm{Odds}} \frac{T'(x - nR)}{\sqrt{T(x - nR)}} , \tag{10}$$

where the sum runs over the positive odd integers, assuming G → 0 at infinity and the sum
converges. Formally, this expression correctly gives G in
terms of T; however, there are some significant practical considerations. To understand the
source of the problem, consider a Gaussian target distribution:

$$T(x) \propto e^{-x^2 / 2\sigma^2} . \tag{11}$$

Then, plugging into equation [10], G is

$$G(x) = A \sum_{n \in \mathrm{Odds}} (nR - x) \, e^{-(x - nR)^2 / 4\sigma^2} . \tag{12}$$

The problem with this expression can be seen by imagining trying to normalize G. Each
term of the sum integrates to zero, but the whole thing must not integrate to zero since G is
a probability distribution. This is a problem with commuting the two infinities, in the sum
and the integral, shown by numeric summation of the first few terms (figure [1]). There is a
negative bubble that moves off to infinity as the number of terms increases, and a Gaussian
like peak that remains near zero. Thus, the limit of the sum must be taken before that of
the integral.
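
The behavior of the truncated sum can be reproduced numerically. The sketch below is an illustration only: it assumes σ = 1, R = 1 and A = 1, and the helper names `partial_sum` and `integrate` are hypothetical. It sums the first few terms of the Gaussian series, locates the positive peak and the negative bubble on a grid, and integrates the truncated sum with the trapezoid rule.

```python
import math

def partial_sum(x, n_terms, R=1.0, sigma=1.0):
    """First n_terms terms (n = 1, 3, 5, ...) of the Gaussian series, with A = 1."""
    total = 0.0
    for k in range(n_terms):
        n = 2 * k + 1
        u = x - n * R
        total += -u * math.exp(-u * u / (4.0 * sigma ** 2))
    return total

def integrate(n_terms, lo=-40.0, hi=60.0, steps=20000):
    """Trapezoid estimate of the integral of the truncated sum."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * partial_sum(lo + i * dx, n_terms)
    return total * dx

grid = [-10.0 + 0.01 * i for i in range(3001)]   # x from -10 to 20
for n_terms in (1, 5):
    values = [partial_sum(x, n_terms) for x in grid]
    peak = grid[values.index(max(values))]
    bubble = grid[values.index(min(values))]
    print(n_terms, round(peak, 2), round(bubble, 2), integrate(n_terms))
```

Each truncated sum integrates to approximately zero, while the location of its minimum moves to larger x as terms are added and the positive peak stays near the origin.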

Alternatively, since the negative bubble always moves to plus infinity, we can rely on the
fact that G(x) is even because T(x) is even, and only look at x < 0. While numerically such
a process works fine, it complicates the integrals considerably as far as trying to approximate
G analytically.

FIG. 1. (color online) The sum in equation [12] for a Gaussian target distribution with σ = 1 and
R = 1 with the infinite sum cutoff at 1, 2, 3, 4, and 5 terms. As the number of terms is increased
the negative bubble moves off to infinity leaving only the Gaussian shaped peak near the origin.
The dashed black line is the best fit Gaussian (σ = 1.3(1)) with 500 terms. Note that the fit
Gaussian has so much error because the guess distribution is not exactly a Gaussian when R ≠ 0.

To make some headway analytically, consider a Laplace transform approach. For convenience,
define q(x) = A√T(x), and then the bilateral Laplace transform of equation [7] is

$$\int_{-\infty}^{\infty} dx \, e^{-sx} \int_{x-R}^{x+R} dt \, G(t) = \int_{-\infty}^{\infty} dx \, e^{-sx} q(x) \quad \Longrightarrow \quad \frac{\sinh(Rs)}{s} \, \tilde{G}(s) \propto \tilde{q}(s) , \tag{13}$$

where the sinh(Rs)/s prefactor can be derived easily using the same boxcar trick as before
(equation [3]).
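
For completeness, the prefactor can be written out as a short worked step. This is a sketch that assumes the order of integration may be exchanged, writes G̃(s) for the bilateral Laplace transform of G, and uses the symmetry Q(t;x,R) = Q(x;t,R):

$$
\int_{-\infty}^{\infty} dx \, e^{-sx} \int_{-\infty}^{\infty} dt \, G(t) \, Q(t; x, R)
= \int_{-\infty}^{\infty} dt \, G(t) \int_{t-R}^{t+R} dx \, e^{-sx}
= \int_{-\infty}^{\infty} dt \, G(t) \, e^{-st} \, \frac{e^{Rs} - e^{-Rs}}{s}
= \frac{2 \sinh(Rs)}{s} \, \tilde{G}(s) .
$$

The factor of 2 is absorbed into the proportionality.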

A. Box car target distribution

The Laplace transform can be very difficult to invert, so equation [13] is not always useful;
however, it does work for special cases of boxcar target distributions and Gaussian
distributions with small R. First, consider the boxcar

$$T(x) \propto \begin{cases} 1 & -Q < x < Q, \\ 0 & \text{otherwise.} \end{cases} \tag{14}$$

Up to a constant factor, the square root of a boxcar is also a boxcar, so

$$\tilde{q}(s) \propto \int_{-Q}^{Q} dx \, e^{-sx} \propto \frac{\sinh(Qs)}{s} . \tag{15}$$

Then, from equation [13],

$$\tilde{G}(s) \propto \frac{\sinh(Qs)}{\sinh(Rs)} . \tag{16}$$

For the special case R = Q, everything cancels and G̃(s) ∝ 1 immediately gives G(x) = δ(x)
as expected. By looking at the center of the distribution the search radius covers all possible
targets and is clearly optimal.

Similarly, for R = Q/2, we can apply the hyperbolic identity sinh 2x = 2 sinh x cosh x to
give after minor simplification

$$\tilde{G}(s) \propto e^{Rs} + e^{-Rs} , \tag{17}$$

and the inverse is G(x) = 1/2 [δ(x + R) + δ(x − R)]. Again, this perfectly tiles the search
space with no overlap and is the best search strategy. All such integer fractions of Q can
be solved, and they similarly generate a perfectly tiled cover of the interval [-Q, Q] (figure
[3]B).
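
The general case Q = nR rests on the identity sinh(nRs)/sinh(Rs) = Σ_i exp(x_i s), with tile centers x_i = R + 2Ri − nR for i = 0, ..., n − 1. The following Python sketch checks that identity numerically; the function names are illustrative and the specific values of n, R and s are arbitrary test inputs.

```python
import math

def tile_centers(n, R):
    """Centers x_i = R + 2*R*i - n*R of n tiles of radius R covering [-n*R, n*R]."""
    return [R + 2.0 * R * i - n * R for i in range(n)]

def sinh_ratio(n, R, s):
    """Transform of the guess distribution for a boxcar target with Q = n*R."""
    return math.sinh(n * R * s) / math.sinh(R * s)

def exp_sum(n, R, s):
    """Transform of a sum of delta functions placed at the tile centers."""
    return sum(math.exp(x * s) for x in tile_centers(n, R))

print(tile_centers(2, 1.0))                       # two tiles centered at -R and +R
print(sinh_ratio(3, 0.5, 0.7), exp_sum(3, 0.5, 0.7))
```

Because the two transforms agree, the inverse is a set of equally weighted delta functions at the tile centers, which is the perfectly tiled cover described in the text.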

B. Gaussian target distribution

Another case of great interest is the Gaussian target distribution. In that case, notice that
x/sinh x itself looks like a Gaussian, so we can approximate it as a Gaussian by matching
variances. The integral of x/sinh x is standard, and the normalized second moment is π²/2.
Thus, the useful approximation is a Gaussian with a matching second moment

$$\frac{x}{\sinh(Rx)} \approx \frac{1}{R} \, e^{-R^2 x^2 / \pi^2} . \tag{18}$$

Finally, plugging in the square root of a Gaussian with standard deviation σ_T for q leaves
(dropping all coefficients as usual)

$$G(x) \propto \exp\left[ - \frac{x^2}{4 \left( \sigma_T^2 - R^2 / \pi^2 \right)} \right] \tag{19}$$

for σ_T > R/π (figure [3]A, D). For R/π > σ_T the approximation breaks down. Thus, the
optimal guess distribution is approximately a Gaussian with standard deviation
σ_G ≈ √(2σ_T² − 2R²/π²), in good agreement with numerical estimates (figure [2]).
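
The moment matching step can be verified numerically. The sketch below is illustrative: the integration window and step count are assumptions chosen for convenience, and the helper names are hypothetical. It estimates the normalized second moment of x/sinh x with the trapezoid rule and evaluates the resulting standard deviation of the guess distribution.

```python
import math

def x_over_sinh(x):
    """x / sinh(x), using the limiting value 1 at the origin."""
    return 1.0 if abs(x) < 1e-8 else x / math.sinh(x)

def moment(power, half_width=60.0, steps=120000):
    """Trapezoid estimate of the integral of x**power * x / sinh(x)."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x ** power * x_over_sinh(x)
    return total * dx

def sigma_guess(sigma_t, R):
    """Approximate guess standard deviation; only valid for sigma_t > R / pi."""
    return math.sqrt(2.0 * sigma_t ** 2 - 2.0 * R ** 2 / math.pi ** 2)

second_moment = moment(2) / moment(0)
print(second_moment, math.pi ** 2 / 2)
print(sigma_guess(1.0, 1.0))
```

For σ_T = 1 and R = 1 this evaluates to roughly 1.34, consistent with the fitted value 1.3(1) quoted for FIG. 1.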




FIG. 2. This is the standard deviation of the optimum guess σ_G versus the search radius R
(in units of σ_T). The dots are numerically calculated by assuming the guess distribution is a
Gaussian and directly optimizing ⟨n⟩. The line is the analytical result using the Gaussian
approximation for x/sinh x. In both cases the target distribution is a Gaussian with units set by
choosing σ_T = 1.
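
A direct optimization of this kind can be sketched in a few lines. The code below is not the author's program; it is a minimal illustration that assumes a Gaussian guess of width σ_G, σ_T = 1, a simple grid search over σ_G, and trapezoid integration of the mean number of trials. The names `hit_probability` and `mean_trials`, the grid limits, and the cutoff `x_max` are all assumptions.

```python
import math

def hit_probability(x, sigma_g, R):
    """Probability that a Gaussian guess lands within R of a target at x >= 0."""
    scale = sigma_g * math.sqrt(2.0)
    # erfc keeps precision in the far tail, where a difference of erf values cancels.
    return 0.5 * (math.erfc((x - R) / scale) - math.erfc((x + R) / scale))

def mean_trials(sigma_g, R, sigma_t=1.0, x_max=30.0, steps=3000):
    """Mean number of trials for Gaussian T and G, by the trapezoid rule on x >= 0."""
    dx = x_max / steps
    norm = sigma_t * math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(steps + 1):
        x = i * dx
        target = math.exp(-x * x / (2.0 * sigma_t ** 2)) / norm
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * target / hit_probability(x, sigma_g, R)
    return 2.0 * total * dx   # the integrand is even in x

R = 0.5
candidates = [1.10 + 0.01 * k for k in range(91)]   # sigma_g from 1.10 to 2.00
best_sigma = min(candidates, key=lambda s: mean_trials(s, R))
analytic = math.sqrt(2.0 - 2.0 * R ** 2 / math.pi ** 2)
print(best_sigma, analytic)
```

The grid starts above σ_G = σ_T because the integral for the mean number of trials only converges when the guess distribution is wider than the target.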



C. Convolution 



As a further example highlighting the non-local nature of the search, consider the special
case where the target distribution can be written in the form of a convolution squared:

$$T(x) \propto \left[ \int_{-\infty}^{\infty} d\tau \, g(\tau) \, f(x - \tau) \right]^2 , \tag{20}$$

where f and g are arbitrary functions such that T is a probability distribution. Then, since
the Laplace transform of the convolution is the product of the Laplace transforms,

$$\tilde{G}(s) \propto \frac{s}{\sinh(Rs)} \, \tilde{g}(s) \tilde{f}(s) . \tag{21}$$

Now we can combine the boxcar and Gaussian functions from before to highlight the
interesting, non-local behavior. Let g be a boxcar of radius nR for n > 0 an integer as before.
Then,

$$\tilde{G}(s) \propto \frac{\sinh(nRs)}{\sinh(Rs)} \, \tilde{f}(s) = \sum_i e^{x_i s} \tilde{f}(s) , \tag{22}$$

where the sum is over x_i = R + 2Ri − nR and i ∈ Z_n to tile the interval [−nR, nR]
evenly with subintervals of length 2R. Next, exp(−as) is the Laplace transform of δ(x − a),
and the x_i are symmetric about zero, so

$$G(x) \propto \sum_i f(x - x_i) . \tag{23}$$

This can lead to some interesting effects. For example, set f ∝ exp(−x²/2σ²) and let B_Q(x) be
a boxcar of radius Q centered at x, then the target function is

$$T(x) \propto \left[ \int_{-\infty}^{\infty} d\tau \, B_Q(x - \tau) \, e^{-\tau^2 / 2\sigma^2} \right]^2 = \left[ \int_{x-Q}^{x+Q} d\tau \, e^{-\tau^2 / 2\sigma^2} \right]^2 \propto \left[ \mathrm{erf}\left( \frac{Q - x}{\sqrt{2} \sigma} \right) + \mathrm{erf}\left( \frac{Q + x}{\sqrt{2} \sigma} \right) \right]^2 , \tag{24}$$

which looks like a Gaussian with a peak at zero. The guess distribution for Q = 2R, on the other
hand, has two peaks at ±R, and the most likely place to search does not coincide with the
most likely target location (figure [3]C).
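
The mismatch between the most likely target location and the most likely search location is easy to see numerically. The following sketch assumes σ = 1, R = 3 and n = 2 (so Q = 6), drops all normalization constants, and uses illustrative function names.

```python
import math

def target(x, Q, sigma):
    """Squared convolution of a radius-Q boxcar with a Gaussian (unnormalized)."""
    root2s = math.sqrt(2.0) * sigma
    return (math.erf((Q - x) / root2s) + math.erf((Q + x) / root2s)) ** 2

def guess(x, n, R, sigma):
    """Sum of copies of f centered on the tile centers x_i (unnormalized)."""
    centers = [R + 2.0 * R * i - n * R for i in range(n)]
    return sum(math.exp(-(x - c) ** 2 / (2.0 * sigma ** 2)) for c in centers)

R, n, sigma = 3.0, 2, 1.0
Q = n * R
for x in (0.0, R):
    print(x, target(x, Q, sigma), guess(x, n, R, sigma))
```

With these values the target is largest at x = 0, while the guess distribution is far larger at x = ±R than at the origin.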



III. DISCRETE SYSTEMS 



While the search technique introduced here is very powerful for continuous systems, it
also applies to discrete systems for which the target can only be found by looking at its
site. These can be considered a special case of continuous systems, or more simply, write
the discrete version of equation [1] for a set of discrete sites A:

$$\langle n \rangle = \sum_{i \in A} \frac{t_i}{g_i} , \tag{25}$$

where t_i are the target probabilities and g_i the guess probabilities. Then, as one could guess
from the general result for small search radius (equation [8]), the optimum choice of g_i given
t_i is g_i = A√t_i where A = 1/Σ_i √t_i maintains normalization.
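
The discrete optimum can be verified with a short Lagrange multiplier calculation. The worked step below is a sketch that assumes all t_i > 0 and uses a multiplier λ, a symbol introduced here only for illustration:

$$
\frac{\partial}{\partial g_i} \left[ \sum_j \frac{t_j}{g_j} + \lambda \left( \sum_j g_j - 1 \right) \right] = - \frac{t_i}{g_i^2} + \lambda = 0
\quad \Longrightarrow \quad
g_i = \frac{\sqrt{t_i}}{\sum_j \sqrt{t_j}} , \qquad
\langle n \rangle_{\min} = \Big( \sum_j \sqrt{t_j} \Big)^2 .
$$

For two sites with t = (0.99, 0.01) this minimum is about 1.2.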
FIG. 3. (color online) Some interesting sample target distributions (black, dot-dashed) are shown
with their associated guess distributions (red, solid, dotted, and long dashed) and the square root
approximation (blue, short dashed). Part A shows a Gaussian distribution with σ_T = 1 to set the
units and R ∈ {0.5, 1.5, 2.5}, and is calculated using equation [19]. Part B shows a super exponential
exp −(x/3) to approximate a boxcar with R ∈ {0.1, 1.1, 2.1} calculated from the numeric sum
(equation [10]). The optimum guess distributions approach the delta function solutions for a true
boxcar. Part C uses a target distribution of a Gaussian with σ = 1 convolved with a boxcar
B_Q(0). As R ∈ {2, 3, 6} increases peaks appear, highlighting the non-local nature of the search.
For R = 3 the most likely search points are at ±3, but the target is most likely at zero. Part D
shows a two dimensional Gaussian target distribution (black) and the optimum guess distribution
for R = 0.5. For clarity the normalization is such that the peak heights match.

A discrete set of isolated sites is quite simple, but a system of great interest and richness is
networks [15, 16]. On networks, one can imagine a searcher with random access to the nodes
that is able to look at any node and out along the node's edges to its nearest neighbors.
This is akin to typing in a web address and checking the links on the resulting page for the
target, rather than just following links from page to page. In that case, the probability of
finding a single target at any given node u depends on the in-degree of the node, for each
of its neighbors, plus one, for the node itself: d_u + 1, and the mean search time for a target
randomly hidden on some fixed node is

$$\langle n \rangle = \sum_i \frac{t_i}{g_i + \sum_{j \to i} g_j} , \tag{26}$$

where i is a node, and the sum in the denominator is over all nodes that have an edge pointing
to i. The mean field assumption here is that the probability that the target is hidden at
node i is equal to the probability of finding the target at node i so that g_i ∝ √(1 + d_out,i),
but it can fail spectacularly for graphs. As a simple example, consider a star graph with one
node connected to 5 independent nodes (figure [4] left), and hide the target on that graph
with equal probability at each node. Then, the probability of finding the target by looking
at the center node is 1, but the probability the target is hidden at the center node is only
1/6.

The star graph (figure [4] left) also points to an heuristic solution of the search problem
because the ideal strategy is certainly to look only at the center node and not at the others,
finding the target in one try every time. But, the simple approach, ignoring the graph
structure, would have a finite probability of looking at the boundary nodes. Thus, the
heuristic proposed here is to start from the square root of the out-degree rule and iteratively
remove nodes from consideration if such a removal improves the search time: look at a node
u with probability either proportional to √(1 + d_out,u) or zero. Note that this is not very
efficient since the setup time for a search is at least of the order of the size of the graph, so
if the search only happens once, it would be just as good to just check every node. If there
are many searches on a static network, then this algorithm may be advantageous, but it is
more likely that the best search would combine the usual random search [17] with the
strategy presented here as a starting point or, for an example on the internet, to choose
places to randomly jump in the popular PageRank algorithm [18].
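
The star graph makes a convenient test of the square root or nothing idea. The sketch below is one possible greedy reading of the heuristic, not the author's implementation: the removal order, the stopping rule, and all function names are assumptions. It treats the 5-star as undirected, hides the target uniformly, and drops a node whenever doing so lowers the mean search time.

```python
import math

def mean_search_time(guess, in_neighbors, target):
    """Sum over nodes i of t_i / (g_i + sum of g_j over nodes j pointing to i)."""
    total = 0.0
    for i, t in enumerate(target):
        found = guess[i] + sum(guess[j] for j in in_neighbors[i])
        if found == 0.0:
            return math.inf          # node i can never be seen
        total += t / found
    return total

def sqrt_guess(active, out_degree):
    """Square root of out degree plus one on active nodes, zero elsewhere."""
    weights = [math.sqrt(1 + d) if a else 0.0 for a, d in zip(active, out_degree)]
    norm = sum(weights)
    return [w / norm for w in weights]

def prune(in_neighbors, out_degree, target):
    """Greedy removal: drop a node whenever that lowers the mean search time."""
    active = [True] * len(target)
    best = mean_search_time(sqrt_guess(active, out_degree), in_neighbors, target)
    improved = True
    while improved:
        improved = False
        for u in range(len(target)):
            if not active[u] or sum(active) == 1:
                continue
            active[u] = False
            trial = mean_search_time(sqrt_guess(active, out_degree), in_neighbors, target)
            if trial < best:
                best, improved = trial, True
            else:
                active[u] = True     # removal did not help, so restore the node
    return active, best

# Undirected 5-star: node 0 is the center, nodes 1..5 are the leaves.
in_neighbors = [[1, 2, 3, 4, 5]] + [[0]] * 5
out_degree = [5, 1, 1, 1, 1, 1]
target = [1 / 6] * 6
start = mean_search_time(sqrt_guess([True] * 6, out_degree), in_neighbors, target)
active, best = prune(in_neighbors, out_degree, target)
print(start, active, best)
```

On this graph the plain square root rule needs about 2.2 looks on average, and the greedy removal ends with only the center node active and a mean search time of one look.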



FIG. 4. (color online) These are two sample graphs. On the left is a simple 5-star graph, and the
obvious best strategy is to look at the center node (large, blue) where any target is found in exactly
one step. On the right is the adjnoun network [19] with nodes to be looked at labeled blue, larger,
and ones that should be ignored in red, smaller. The blue nodes are found using the heuristic
algorithm described in the main part.

Since the proposed solution is only an heuristic, consider a direct Monte Carlo minimization
(figure [5]) of the mean search time (equation [26]). The simulation starts with a random
assignment of the probability to look at each node. Then, at each time step one of the nodes
is selected and changed by a random amount bounded by a step size, S, and constrained
to stay in the interval [0, 1]. For a trial step, the change in mean search time is calculated
from equation [26] and steps are accepted with the usual Monte Carlo Boltzmann weight:
exp(−Δ⟨n⟩/T). The initial step size is S_0 = 0.01, and the initial temperature is T = 0.01,
which is large compared to the usual change in probability. After 1000 steps without finding
a smaller minimum ⟨n⟩, the temperature and step size are both lowered by a factor of two.
The simulation stops after at most 10 million iterations, T < 10^-12, or S < 10^-12. Figure
[5] shows the minimization for six graphs from freely available databases [19-21]. For larger
graphs the convergence is poor, but the split between either the square root rule or nothing
is apparent. Table I shows the average search time for various publicly available data sets.
A simple model looking at nodes with probability proportional to the out degree plus one
(with no square root) is included for comparison when the Monte Carlo optimization is not
feasible.
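
The Monte Carlo minimization can be sketched as a small annealing loop. This is an illustration under stated assumptions rather than the code used for the figures: the guess vector is renormalized after every trial change, the iteration cap is far below 10 million, the random seed is fixed, and the test graph is an undirected 5-star with a uniformly hidden target. The name `anneal` and its parameters are hypothetical.

```python
import math
import random

def mean_search_time(guess, in_neighbors, target):
    """Sum over nodes i of t_i / (g_i + sum of g_j over nodes j pointing to i)."""
    total = 0.0
    for i, t in enumerate(target):
        found = guess[i] + sum(guess[j] for j in in_neighbors[i])
        if found <= 0.0:
            return math.inf
        total += t / found
    return total

def anneal(in_neighbors, target, max_iter=100_000, seed=0):
    rng = random.Random(seed)
    n = len(target)
    guess = [rng.random() for _ in range(n)]
    norm = sum(guess)
    guess = [g / norm for g in guess]
    current = mean_search_time(guess, in_neighbors, target)
    best, best_guess = current, guess[:]
    step, temperature, stale = 0.01, 0.01, 0
    for _ in range(max_iter):
        if temperature < 1e-12 or step < 1e-12:
            break
        trial = guess[:]
        k = rng.randrange(n)
        # Bounded random change, kept inside [0, 1], then renormalized (assumption).
        trial[k] = min(1.0, max(0.0, trial[k] + rng.uniform(-step, step)))
        norm = sum(trial)
        if norm == 0.0:
            continue
        trial = [g / norm for g in trial]
        value = mean_search_time(trial, in_neighbors, target)
        delta = value - current
        if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
            guess, current = trial, value
        if current < best:
            best, best_guess, stale = current, guess[:], 0
        else:
            stale += 1
        if stale >= 1000:            # no new minimum for 1000 steps: cool down
            step, temperature, stale = step / 2.0, temperature / 2.0, 0
    return best_guess, best

in_neighbors = [[1, 2, 3, 4, 5]] + [[0]] * 5   # node 0 is the center
target = [1 / 6] * 6
best_guess, best = anneal(in_neighbors, target)
print(best_guess, best)
```

On the star graph the loop should concentrate nearly all of the guess probability on the center node, in line with the square root or nothing split seen in the larger networks.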



IV. CONCLUSION 



In conclusion, a searcher that has random access to possible target positions and is
trying to locate only a single target should sample a guess distribution as described here.
If the searcher must be very close to a target continuously distributed in any dimension
to locate it, then the guess distribution is approximately the square root of the target
distribution. Otherwise, the guess distribution is more difficult to estimate, but can be
numerically evaluated with sufficient care. On discrete networks an heuristic approach is to
either look at a node with probability proportional to the square root of its out degree plus
one or not at all. Determining a fast, local algorithm to decide whether or not to look at
a node remains an open problem, but the search times from the heuristic compare favorably
to those from a Monte Carlo estimation of the optimal search.

Data set               Size      Probability match   Monte Carlo   Heuristic
www [22]               1795408   62600 ± 400         na            33200 ± 400
adjnoun [19]           962       17.4 ± 0.3          9.5 ± 0.1     10.1 ± 0.2
as-22july06 [23]       119835    2120 ± 60           710 ± 10      760 ± 20
astro-ph [20]          258548    6200 ± 200          1220 ± 20     1420 ± 20
celegansneural [21]    2642      40.1 ± 0.7          35.4 ± 0.5    41.1 ± 0.7
cond-mat [20]          111452    4700 ± 100          1930 ± 24     2120 ± 30
cond-mat-2003 [20]     270518    9000 ± 200          2850 ± 40     3180 ± 40
cond-mat-2005 [20]     390963    11000 ± 300         na            3830 ± 60
hep-th [20]            39112     2710 ± 14           na            1440 ± 20
netscience [19]        6945      585 ± 3             na            316 ± 4
polblogs [24]          20246     234 ± 3             195 ± 1       222 ± 4
power [21]             18129     1518 ± 6            na            1210 ± 10

TABLE I. This is a table of the average search time for a randomly hidden target on the specified
network. Probability match looks at nodes with probability proportional to their out degree plus
one as a default comparison, Monte Carlo is a minimization as described in the text, and heuristic
is the square root or nothing heuristic described in the text. Errors are all estimated by running
the search 10,000 times for the calculated guess distribution and a random target.

FIG. 5. (color online) Monte Carlo minimization of the search time for various publicly available
networks. The points are the guess distribution versus the out degree plus one for each node. The
line has slope 1/2. The guess probability either heads to zero (the lower cluster) or √(1 + d_out). See
table I for references.